Introduction

This is a template for an analysis notebook using RMarkdown.

In this notebook, we will perform the Seurat workflow [normalization –> reduction –> clustering] and explore the clustering results.

This corresponds to step 2 of the proposed analysis: clustering of cells across a set of parameters for a few samples.

To assess the clustering quality, we look at marker genes, pathway enrichment and label transfer. This approach gives us a quick idea of the quality of the clustering:

  • if each cluster has its own set of specific marker genes, we can expect each cluster to correspond to a distinct phenotype,
  • if marker genes are mostly shared between clusters, we might have overclustered,
  • if no marker genes are found, the quality of the cells in the clusters might be poor (high mitochondrial content? ribosomal content?).
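As a rough illustration of the three cases above, the following sketch counts in how many clusters each marker gene is reported and derives the fraction of cluster-specific markers. The marker table, cluster labels and gene names are made up for the example; in practice the table would come from the marker detection step.

```r
# Toy marker table: one row per (cluster, gene) pair.
# Cluster labels and gene names are hypothetical.
markers <- data.frame(
  cluster = c("0", "0", "1", "1", "2", "2"),
  gene    = c("geneA", "geneB", "geneC", "geneD", "geneC", "geneD"),
  stringsAsFactors = FALSE
)

# Number of clusters in which each gene is reported as a marker
clusters_per_gene <- table(markers$gene)
n_clusters <- as.vector(clusters_per_gene[markers$gene])

# A marker is called "specific" when it is reported for exactly one cluster
markers$specific <- n_clusters == 1

# Fraction of specific markers per cluster
specific_fraction <- tapply(markers$specific, markers$cluster, mean)
print(specific_fraction)
```

In this toy example cluster 0 only has specific markers, whereas clusters 1 and 2 share all of their markers, which is the pattern we would read as a hint of overclustering.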

Description of the analysis workflow

We perform the following analyses to assess the quality of the clustering:

[1] We perform some quality checks to detect any QC-induced clustering (nFeature, nCount, percent.mito).

[2] We add cell cycle information, as we know that in specific cell cycle phases the transcriptional program is mostly or exclusively driven by cell cycle genes, which makes the identity of the cells difficult to determine. We expect these cells to group together in a cluster of proliferating cells.

[3] We look at specific marker genes that we reported in the table marker-sets/CellType_metadata.csv to check the relevance of the clustering.

[4] We look at specific pathways to check the relevance of the clustering.

[5] We run DElegate::FindAllMarkers2 to find markers of the different clusters and manually check whether they make sense. DElegate::FindAllMarkers2 is an improved version of Seurat::FindAllMarkers based on a pseudobulk differential expression method.

[6] We perform enrichment analysis of the marker genes of each Seurat cluster. We define all the genes of the Seurat object as the universe and use the MSigDB gene sets.

[7] We plot the pca/umap reductions grouped by the available annotations (singler_, cellassign_). We expect at least immune cells to be correctly labeled and to fall into a small set of clusters.

[ ] Next step, planned for a future PR: we will transfer annotations from two human fetal (kidney) references (Stewart et al., Cao et al.) and plot the pca/umap reductions grouped by the resulting labels. We expect these labels to be the most representative of the cell types in the sample.
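To make the term "pseudobulk" in step [5] more concrete, here is a minimal sketch of the aggregation idea: counts of all cells sharing the same cluster and replicate label are summed into one profile. This is not the DElegate implementation; the count matrix, cluster labels and replicate labels are invented for illustration.

```r
# Toy count matrix: 3 genes x 6 cells (all values are made up)
counts <- matrix(
  c(1, 0, 2,
    0, 1, 1,
    3, 0, 0,
    2, 2, 1,
    0, 4, 0,
    1, 1, 1),
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6))
)

# Hypothetical cluster and replicate labels, one per cell
cluster   <- c("A", "A", "A", "B", "B", "B")
replicate <- c("r1", "r2", "r1", "r1", "r2", "r2")
group     <- paste(cluster, replicate, sep = "_")

# rowsum() sums rows by group, so we transpose to cells x genes,
# aggregate, and transpose back to genes x pseudobulk samples
pseudobulk <- t(rowsum(t(counts), group))
print(pseudobulk)
```

The resulting genes x samples matrix has one column per cluster/replicate combination and could then be passed to a bulk differential expression method.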

Packages

Load required packages in the following chunk, if needed. Do not install packages here; only load them with the library() function.

library(SingleCellExperiment)

library("Seurat")

#library(Azimuth)            # Will be required once we add the label transfer from the Azimuth fetal reference
#library(SeuratData)         # Will be required once we add the label transfer from the Azimuth fetal reference

library(viridis)            # for plotting with viridis colors
library(SCpubr)             # for plotting 
library(tidyverse)
library(patchwork)

library(msigdbr)
library(enrichplot)
library(clusterProfiler)

library(org.Hs.eg.db)


set.seed(params$seed)

Base directories

# The base path for the OpenScPCA repository, found by its (hidden) .git directory
repository_base <- rprojroot::find_root(rprojroot::is_git_root)

# The current data directory, found within the repository base directory
data_dir <- file.path(repository_base, "data", params$current, params$scpca_project_id)

# The path to this module
module_base <- file.path(repository_base, "analyses", "cell-type-wilms-tumor-06")

Selected parameters

Please note: to keep the notebook as simple as possible, we only show the analysis for the selected set of parameters:

  • SCTransform (default parameters)
  • RunPCA (default)
  • RunUMAP (dims = 1:50)
  • FindNeighbors (dims = 1:50)
  • FindClusters (default)

Note: Other parameters were tested previously; the following report aims to show that the selected set performs well.

Functions

Here we define functions that are used multiple times throughout the notebook.

Visualize Seurat clusters and marker genes

For a Seurat object object and a feature features, the function visualize_feature plots a FeaturePlot and a ViolinPlot.

  • object is the Seurat object

  • features is the Ensembl ID of the gene to be plotted

  • group.by is the metadata used for grouping the violin plots

visualize_feature <- function(object, features, group.by = "seurat_clusters"){

  # map the Ensembl ID to its gene symbol, which is used as plot title
  feature_symbol <- AnnotationDbi::select(org.Hs.eg.db,
                                          keys = features,
                                          columns = "SYMBOL",
                                          keytype = "ENSEMBL")

  d <- SCpubr::do_FeaturePlot(object,
                              features = feature_symbol$ENSEMBL,
                              pt.size = 0.2,
                              legend.width = 0.5,
                              legend.length = 5,
                              legend.position = "right") + ggtitle(feature_symbol$SYMBOL)

  # use the `object` argument here, not the global `srat`
  b <- SCpubr::do_ViolinPlot(object,
                             features = feature_symbol$ENSEMBL,
                             ncol = 1,
                             group.by = group.by,
                             legend.position = "none") + ylab(feature_symbol$SYMBOL)

  return(d + b + plot_layout(ncol = 2, widths = c(2,4)))

}

Visualize Seurat clusters and metadata

For a Seurat object object and a metadata column meta, the function visualize_metadata plots a FeaturePlot and a ViolinPlot if the metadata is numeric, and a DimPlot and a BarPlot otherwise.

  • object is the Seurat object

  • meta is the name of the metadata column to be plotted

  • group.by is the metadata used for grouping the violin plots or bar plots

visualize_metadata <- function(object, meta, group.by){

  # is.numeric() also covers integer columns, unlike class(x) == "numeric"
  if(is.numeric(object@meta.data[, meta])){
    # numeric metadata: feature plot + violin plot
    d <- SCpubr::do_FeaturePlot(object,
                                features = meta,
                                pt.size = 0.2,
                                legend.width = 0.5,
                                legend.length = 5,
                                legend.position = "right") + ggtitle(meta)

    # use the `object` argument here, not the global `srat`
    b <- SCpubr::do_ViolinPlot(object,
                               features = meta,
                               ncol = 1,
                               group.by = group.by,
                               legend.position = "none")

    return(d + b + plot_layout(ncol = 2, widths = c(2,4)))
  } else {
    # categorical metadata: dim plot + bar plot
    d <- SCpubr::do_DimPlot(object, reduction = "umap", group.by = group.by, label = TRUE, repel = TRUE) +
      ggtitle(paste0(meta, " - umap")) +
      theme(text = element_text(size = 18))

    # note: `file` is the global file name defined when loading the data
    b <- SCpubr::do_BarPlot(sample = object,
                            group.by = meta,
                            split.by = group.by,
                            position = "fill",
                            font.size = 10,
                            legend.ncol = 3) +
      ggtitle("% cells") +
      xlab(file) +
      theme(text = element_text(size = 18))

    return(d + b + plot_layout(ncol = 2, widths = c(2,4)))
  }

}

Calculate ModuleScore from a MSigDB dataset

For a Seurat object object, the function MSigDB_score calculates a score with the AddModuleScore() function for the MSigDB gene set gs_name in the category category.

  • object is the Seurat object for which the score is calculated. The score is added to the metadata of this object.

  • category is a MSigDB collection (https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp). Possible values are c("H", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")

  • gs_name is the name of a MSigDB gene set, e.g. "HALLMARK_P53_PATHWAY"

  • name is the name of the module score. AddModuleScore() appends an index to it, so a score named "TP53_score" is stored in the metadata column "TP53_score1"

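MSigDB_score <- function(object, category, gs_name, name){

  # all gene sets of the requested MSigDB collection
  set <- msigdbr(species = "human", category = category)

  # copy the argument to a name that is not a column of `set`:
  # inside filter(), `gs_name == gs_name` would compare the column with
  # itself and keep every gene set of the collection
  selected_gs_name <- gs_name

  set_list <- set %>%
    dplyr::filter(gs_name == selected_gs_name) %>%
    dplyr::distinct(gs_name, gene_symbol, human_ensembl_gene) %>%
    as.data.frame()

  # AddModuleScore() expects a list of feature vectors
  set_list <- list(set_list$human_ensembl_gene)

  object <- AddModuleScore(object, features = set_list, name = name)

  return(object)
}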
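To give an intuition of what a module score measures, here is a deliberately simplified sketch: the score of a cell is the mean expression of the gene set minus the mean expression of a set of control genes. As far as we understand, Seurat's AddModuleScore() additionally draws the control genes from expression-matched bins, which this sketch omits. The expression matrix and gene names are random toy data.

```r
set.seed(1)

# Toy normalized expression matrix: 20 genes x 5 cells (random values)
expr <- matrix(
  rnorm(100), nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:5))
)

# Hypothetical gene set; genes absent from the matrix are dropped,
# similar to the "features are not present in the object" warning
gene_set <- c("gene1", "gene2", "gene3", "not_in_matrix")
present  <- intersect(gene_set, rownames(expr))

# Control genes: a random draw from the remaining genes
# (simplification: no expression-matched binning)
control <- sample(setdiff(rownames(expr), present), size = 10)

# Score per cell = mean(gene set) - mean(control genes)
simple_score <- colMeans(expr[present, , drop = FALSE]) -
  colMeans(expr[control, , drop = FALSE])
print(simple_score)
```

A positive value means that the gene set is, on average, more expressed in that cell than the control genes.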

Enrichment analysis

Enrichment_plot performs an enrichment analysis of the marker genes of each Seurat cluster and summarizes the results in a dotplot.

  • category is the MSigDB category or collection to be used. Possible values are c("H", "C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8")

  • signatures is a list of marker genes per cluster

  • background is the universe used for enrichment

Enrichment_plot <- function(category, signatures, background){

    ## define genesets
    gene_set <- msigdbr(species = "human", category = category)
    msigdbr_set <-  gene_set %>% dplyr::distinct(gs_name, ensembl_gene) %>% as.data.frame()

    cclust<-compareCluster(geneCluster = signatures, 
               fun = enricher,
               TERM2GENE = msigdbr_set,
               universe=background)
    d <- dotplot(cclust,showCategory=15) + scale_y_discrete(labels=function(x) str_wrap(x, width=40))
    return(d)  
}
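The role of the universe can be illustrated with the over-representation test that, to our understanding, underlies enricher(): a hypergeometric test of the overlap between the marker genes and a gene set, given the size of the background. The numbers below are hypothetical and only serve the example.

```r
# Hypothetical numbers for one cluster and one gene set
universe_size <- 2000  # genes in the background (universe)
gene_set_size <- 100   # genes of the gene set found in the universe
n_markers     <- 50    # marker genes of the cluster
n_overlap     <- 10    # marker genes that belong to the gene set

# P(overlap >= n_overlap) when drawing n_markers genes at random
p_value <- phyper(
  q = n_overlap - 1,
  m = gene_set_size,
  n = universe_size - gene_set_size,
  k = n_markers,
  lower.tail = FALSE
)

# The same p-value from a one-sided Fisher's exact test on the 2x2 table
# (rows: in gene set / not in gene set, columns: marker / not marker)
contingency <- matrix(
  c(n_overlap,
    n_markers - n_overlap,
    gene_set_size - n_overlap,
    universe_size - gene_set_size - n_markers + n_overlap),
  nrow = 2
)
p_fisher <- fisher.test(contingency, alternative = "greater")$p.value

print(c(hypergeometric = p_value, fisher = p_fisher))
```

With an expected overlap of 2.5 genes (50 x 100 / 2000), observing 10 genes is unlikely by chance. A smaller or larger universe would change this expectation, which is why the background has to be defined explicitly.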

Analysis content

Load the processed SingleCellExperiment rds object

filelist <- list.files(data_dir, full.names = TRUE)

# select the folder of the sample of interest (params$sample_id)
filedir <- filelist[grepl(params$sample_id, filelist)]

file <- list.files(filedir)
file <- file[grepl("_processed.rds", file)]

# open the processed rds object
sce <- readRDS(paste0(filedir, "/", file))

Run Seurat pipeline

# convert to seurat
srat <- CreateSeuratObject(counts = counts(sce),
                                    assay = "RNA",
                                    project = params$sample_id
                           )

# convert colData and rowData to data.frame for use in the Seurat object
cell_metadata <- as.data.frame(colData(sce))
row_metadata <- as.data.frame(rowData(sce))

# add cell metadata (colData) from SingleCellExperiment to Seurat
srat@meta.data <- cell_metadata

# add row metadata (rowData) from SingleCellExperiment to Seurat
srat[["RNA"]]@meta.data <- row_metadata

# add metadata from SingleCellExperiment to Seurat
srat@misc <- metadata(sce)
rm(sce)

# normalization
options(future.globals.maxSize = 8912896000000)
srat <- SCTransform(srat, verbose = FALSE, conserve.memory = TRUE)

# dimensionality reduction
srat <- RunPCA(srat, verbose = FALSE)
srat <- RunUMAP(srat, dims = 1:50, verbose = FALSE)

# clustering
srat <- FindNeighbors(srat, dims = 1:50, verbose = FALSE)
srat <- FindClusters(srat, verbose = TRUE)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 1483
## Number of edges: 70183
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.7563
## Number of communities: 8
## Elapsed time: 0 seconds

Visualize seurat clusters

d2 <- SCpubr::do_DimPlot(srat, reduction="umap", group.by = "seurat_clusters", label = TRUE) + ggtitle("Seurat Cluster - umap")
d1 <- SCpubr::do_DimPlot(srat, reduction="pca", group.by = "seurat_clusters", label = TRUE) + ggtitle("Seurat Cluster - pca")
v <- SCpubr::do_ViolinPlot(srat, features = c( "subsets_mito_percent"), ncol = 1, group.by = "seurat_clusters", legend.position = "none") 

d1 + d2 + v + plot_layout(ncol = 3, widths = c(2,2,4))
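The violin plot can be complemented with a numeric summary per cluster to spot QC-induced clustering. The sketch below uses a toy metadata table with the same column names as in the plots above; the values and the "3 times the overall median" threshold are arbitrary assumptions for illustration.

```r
# Toy per-cell metadata; in the notebook these columns would come from srat@meta.data
meta <- data.frame(
  seurat_clusters      = factor(c(0, 0, 0, 1, 1, 2, 2, 2)),
  subsets_mito_percent = c(1.2, 0.8, 1.0, 9.5, 11.0, 0.5, 0.7, 0.9)
)

# Median mitochondrial percentage per cluster
qc_summary <- aggregate(subsets_mito_percent ~ seurat_clusters,
                        data = meta, FUN = median)
print(qc_summary)

# Flag clusters whose median is far above the overall median
# (the factor 3 is an arbitrary choice)
overall <- median(meta$subsets_mito_percent)
flagged <- qc_summary$seurat_clusters[qc_summary$subsets_mito_percent > 3 * overall]
print(flagged)
```

A flagged cluster would be a candidate for a cluster driven by cell quality rather than by cell identity.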

We expect up to 5 sets of clusters:

  • blastema cancer cells
  • epithelium cancer and/or normal cells
  • stroma cancer and/or normal cells
  • immune cells
  • endothelial cells

Cell cycle information

s.genes <- srat@assays$RNA@meta.data$gene_ids[srat@assays$RNA@meta.data$gene_symbol %in% cc.genes$s.genes]
g2m.genes <- srat@assays$RNA@meta.data$gene_ids[srat@assays$RNA@meta.data$gene_symbol %in% cc.genes$g2m.genes]

srat <- CellCycleScoring(srat, s.features = s.genes, g2m.features = g2m.genes, set.ident = FALSE)
## Warning: The following features are not present in the object: ENSG00000117399, ENSG00000117650, ENSG00000175063, not searching for symbol synonyms
visualize_metadata(srat, meta = "Phase", group.by = "seurat_clusters")

visualize_metadata(srat, meta  = "S.Score", group.by = "seurat_clusters")

visualize_metadata(srat, meta = "G2M.Score", group.by = "seurat_clusters")
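To help read the Phase annotation, the following sketch reproduces the assignment rule as we understand it from Seurat: a cell is called G1 when both scores are negative, and otherwise takes the phase with the higher score. Ties, which Seurat handles separately, are not covered here, and the scores are made-up values.

```r
# Simplified phase assignment from the S and G2M scores
# (ties between the two scores are not handled in this sketch)
assign_phase <- function(s_score, g2m_score) {
  ifelse(s_score < 0 & g2m_score < 0, "G1",
         ifelse(s_score > g2m_score, "S", "G2M"))
}

# Hypothetical scores for three cells
scores <- data.frame(
  S.Score   = c(-0.2, 0.5, 0.1),
  G2M.Score = c(-0.1, 0.2, 0.6)
)
scores$Phase <- assign_phase(scores$S.Score, scores$G2M.Score)
print(scores)
```

Under this rule, a cluster dominated by S and G2M cells with high scores is what we would interpret as a cluster of proliferating cells.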

Look at specific genes

Here, we open the table of marker genes marker-sets/CellType_metadata.csv. Note: we do not expect a clear pattern of expression for all of the following markers in every tumor. This is just to get a rough idea.

CellType_metadata <- read_csv(file.path(module_base, "marker-sets", "CellType_metadata.csv"))

DT::datatable(CellType_metadata, caption = ("CellType_metadata"), 
              extensions = 'Buttons', 
              options = list(  dom = 'Bfrtip',

                               buttons = c( 'csv', 'excel')))
for(feature in CellType_metadata$ENSEMBL_ID){
  if(feature %in% rownames(srat)){
    tmp <-  visualize_feature(srat, features = feature, group.by = "seurat_clusters")
    print(tmp)
  }
}

Look at specific pathways

TP53 pathway

Here we calculate a TP53 score using AddModuleScore and the genes of the HALLMARK_P53_PATHWAY gene set.

srat <- MSigDB_score(object = srat , category = "H", gs_name = "HALLMARK_P53_PATHWAY", name = "TP53_score")
## Warning: The following features are not present in the object: ENSG00000122971, ENSG00000182035, ENSG00000181092, ENSG00000285070, ENSG00000168497, ENSG00000123080, ENSG00000242131,
## ENSG00000273607, ENSG00000176194, ENSG00000142973, ENSG00000275003, ENSG00000261698, ENSG00000285482, ENSG00000282853, ENSG00000170323, ENSG00000211445, ENSG00000152137,
## ENSG00000174697, ENSG00000283887, ENSG00000285121, ENSG00000285387, ENSG00000229314, ENSG00000115255, ENSG00000104918, ENSG00000183048, ENSG00000173267, ENSG00000198142,
## ENSG00000137767, ENSG00000175567, ENSG00000087085, ENSG00000273686, ENSG00000204364, ENSG00000206372, ENSG00000226560, ENSG00000231543, ENSG00000235017, ENSG00000235696,
## ENSG00000042493, ENSG00000164326, ENSG00000172156, ENSG00000181374, ENSG00000102962, ENSG00000275302, ENSG00000275824, ENSG00000277943, ENSG00000271503, ENSG00000274233,
## ENSG00000108688, ENSG00000163823, ENSG00000121807, ENSG00000160791, ENSG00000116824, ENSG00000178562, ENSG00000167286, ENSG00000198851, ENSG00000160654, ENSG00000101017,
## ENSG00000102245, ENSG00000173762, ENSG00000105369, ENSG00000121594, ENSG00000153563, ENSG00000172116, ENSG00000147889, ENSG00000126759, ENSG00000109943, ENSG00000138755,
## ENSG00000186810, ENSG00000288145, ENSG00000197561, ENSG00000277571, ENSG00000102034, ENSG00000124882, ENSG00000180210, ENSG00000117560, ENSG00000072694, ENSG00000140030,
## ENSG00000145649, ENSG00000100453, ENSG00000206505, ENSG00000223980, ENSG00000224320, ENSG00000227715, ENSG00000229215, ENSG00000231834, ENSG00000235657, ENSG00000239463,
## ENSG00000241394, ENSG00000242361, ENSG00000242685, ENSG00000243189, ENSG00000243215, ENSG00000243719, ENSG00000226264, ENSG00000234154, ENSG00000239329, ENSG00000241296,
## ENSG00000241674, ENSG00000242092, ENSG00000242386, ENSG00000206292, ENSG00000230141, ENSG00000231558, ENSG00000232957, ENSG00000232962, ENSG00000235744, ENSG00000239457,
## ENSG00000241106, ENSG00000241386, ENSG00000241910, ENSG00000243496, ENSG00000243612, ENSG00000188296, ENSG00000206305, ENSG00000225890, ENSG00000228284, ENSG00000232062,
## ENSG00000236418, ENSG00000206308, ENSG00000226260, ENSG00000227993, ENSG00000228987, ENSG00000230726, ENSG00000234794, ENSG00000277263, ENSG00000206493, ENSG00000225201,
## ENSG00000229252, ENSG00000230254, ENSG00000233904, ENSG00000236632, ENSG00000204632, ENSG00000206506, ENSG00000230413, ENSG00000233095, ENSG00000235346, ENSG00000235680,
## ENSG00000237216, ENSG00000276051, ENSG00000090339, ENSG00000111537, ENSG00000262795, ENSG00000283934, ENSG00000136634, ENSG00000095752, ENSG00000168811, ENSG00000113302,
## ENSG00000096996, ENSG00000169194, ENSG00000115607, ENSG00000125538, ENSG00000109471, ENSG00000288185, ENSG00000134460, ENSG00000100385, ENSG00000147168, ENSG00000113520,
## ENSG00000136244, ENSG00000145839, ENSG00000163083, ENSG00000137265, ENSG00000276561, ENSG00000113263, ENSG00000167768, ENSG00000182866, ENSG00000128342, ENSG00000204487,
## ENSG00000206437, ENSG00000223448, ENSG00000227507, ENSG00000231314, ENSG00000236237, ENSG00000236925, ENSG00000238114, ENSG00000054219, ENSG00000104814, ENSG00000282928,
## ENSG00000165471, ENSG00000100985, ENSG00000100365, ENSG00000275990, ENSG00000189430, ENSG00000273506, ENSG00000273535, ENSG00000273916, ENSG00000274053, ENSG00000275156,
## ENSG00000275521, ENSG00000275637, ENSG00000275822, ENSG00000276450, ENSG00000277334, ENSG00000277442, ENSG00000277629, ENSG00000277824, ENSG00000278025, ENSG00000278362,
## ENSG00000284113, ENSG00000288651, ENSG00000162711, ENSG00000007171, ENSG00000163737, ENSG00000180644, ENSG00000126583, ENSG00000262418, ENSG00000140986, ENSG00000274005,
## ENSG00000274626, ENSG00000274646, ENSG00000274950, ENSG00000275323, ENSG00000277079, ENSG00000277359, ENSG00000278081, ENSG00000278270, ENSG00000137078, ENSG00000185338,
## ENSG00000206297, ENSG00000224212, ENSG00000224748, ENSG00000226173, ENSG00000227816, ENSG00000230705, ENSG00000232367, ENSG00000206235, ENSG00000206299, ENSG00000223481,
## ENSG00000225967, ENSG00000228582, ENSG00000232326, ENSG00000237599, ENSG00000112493, ENSG00000206208, ENSG00000206281, ENSG00000236490, ENSG00000174130, ENSG00000204490,
## ENSG00000206439, ENSG00000223952, ENSG00000228321, ENSG00000228849, ENSG00000228978, ENSG00000230108, ENSG00000232810, ENSG00000163519, ENSG00000115085, ENSG00000160862,
## ENSG00000186480, ENSG00000167751, ENSG00000142515, ENSG00000167034, ENSG00000124664, ENSG00000091583, ENSG00000124875, ENSG00000163132, ENSG00000008438, ENSG00000186652,
## ENSG00000196154, ENSG00000188488, ENSG00000178726, ENSG00000109072, ENSG00000143632, ENSG00000159251, ENSG00000163017, ENSG00000248746, ENSG00000282844, ENSG00000282230,
## ENSG00000275199, ENSG00000179593, ENSG00000181754, ENSG00000139211, ENSG00000105409, ENSG00000162706, ENSG00000172137, ENSG00000282830, ENSG00000090659, ENSG00000280641,
## ENSG00000137197, ENSG00000204539, ENSG00000206460, ENSG00000237114, ENSG00000237123, ENSG00000237165, ENSG00000184113, ENSG00000184697, ENSG00000288292, ENSG00000156284,
## ENSG00000213937, ENSG00000065618, ENSG00000130545, ENSG00000206486, ENSG00000226171, ENSG00000227222, ENSG00000231377, ENSG00000233049, ENSG00000233418, ENSG00000233561,
## ENSG00000152592, ENSG00000134765, ENSG00000134762, ENSG00000128591, ENSG00000276536, ENSG00000108622, ENSG00000105371, ENSG00000105376, ENSG00000094796, ENSG00000262993,
## ENSG00000196878, ENSG00000143217, ENSG00000283859, ENSG00000197893, ENSG00000206315, ENSG00000224952, ENSG00000225987, ENSG00000232005, ENSG00000236353, ENSG00000237344,
## ENSG00000156453, ENSG00000172367, ENSG00000284792, ENSG00000128340, ENSG00000126458, ENSG00000277401, ENSG00000126562, ENSG00000285443, ENSG00000107485, ENSG00000124466,
## ENSG00000162631, ENSG00000169174, ENSG00000140519, ENSG00000185924, ENSG00000197891, ENSG00000288174, ENSG00000198569, ENSG00000183773, ENSG00000166148, ENSG00000137875,
## ENSG00000125845, ENSG00000281484, ENSG00000137752, ENSG00000105974, ENSG00000133101, ENSG00000170458, ENSG00000110848, ENSG00000124762, ENSG00000179388, ENSG00000099860,
## ENSG00000060558, ENSG00000132518, ENSG00000100292, ENSG00000206377, ENSG00000206478, ENSG00000227231, ENSG00000230128, ENSG00000235030, ENSG00000237155, ENSG00000171855,
## ENSG00000167779, ENSG00000172183, ENSG00000100285, ENSG00000117600, ENSG00000141682, ENSG00000284099, ENSG00000172482, ENSG00000118137, ENSG00000103569, ENSG00000138135,
## ENSG00000167910, ENSG00000145321, ENSG00000124713, ENSG00000101323, ENSG00000025423, ENSG00000203857, ENSG00000099377, ENSG00000105610, ENSG00000131910, ENSG00000144852,
## ENSG00000143171, ENSG00000170099, ENSG00000277405, ENSG00000276130, ENSG00000167780, ENSG00000088002, ENSG00000118271, ENSG00000109107, ENSG00000161921, ENSG00000284967,
## ENSG00000281289, ENSG00000185813, ENSG00000130208, ENSG00000234906, ENSG00000110245, ENSG00000173372, ENSG00000288512, ENSG00000157131, ENSG00000021852, ENSG00000176919,
## ASMPATCHG00000001303, ENSG00000262875, ENSG00000204359, ENSG00000239754, ENSG00000241253, ENSG00000241534, ENSG00000242335, ENSG00000243570, ENSG00000197766, ENSG00000274619,
## ENSG00000105664, ENSG00000120054, ENSG00000285132, ENSG00000196188, ENSG00000263238, ENSG00000080166, ENSG00000275932, ENSG00000088926, ENSG00000131187, ENSG00000143278,
## ENSG00000164220, ENSG00000101981, ENSG00000171560, ENSG00000171557, ENSG00000185245, ENSG00000169704, ENSG00000134240, ENSG00000113905, ENSG00000055957, ENSG00000143768,
## ENSG00000112818, ENSG00000189374, ENSG00000196611, ENSG00000166670, ENSG00000275365, ENSG00000102996, ENSG00000149968, ENSG00000118113, ENSG00000169860, ENSG00000100311,
## ENSG00000115956, ENSG00000126231, ENSG00000183155, ENSG00000160678, ENSG00000197249, ENSG00000277377, ENSG00000197632, ENSG00000117601, ENSG00000283100, ENSG00000105825,
## ENSG00000187045, ENSG00000274286, ENSG00000110244, ENSG00000159189, ENSG00000123843, ENSG00000137757, ENSG00000158825, ENSG00000176749, ENSG00000172216, ENSG00000117322,
## ENSG00000163739, ENSG00000138166, ENSG00000110047, ENSG00000057593, ENSG00000158869, ENSG00000085265, ENSG00000167083, ENSG00000113088, ENSG00000116983, ENSG00000215328,
## ENSG00000234475, ENSG00000235941, ENSG00000237724, ENSG00000167748, ENSG00000012223, ENSG00000262406, ENSG00000137745, ENSG00000262325, ENSG00000198736, ENSG00000087250,
## ENSG00000206312, ENSG0
visualize_metadata(srat, meta = "TP53_score1", group.by = "seurat_clusters" )

“Normal” cells (immune, endothelial) have a slightly higher TP53 score.

DNA repair pathway

Here we calculate a DNA_repair score using AddModuleScore and the genes of the HALLMARK_DNA_REPAIR gene set.

srat <- MSigDB_score(object = srat , category = "H", gs_name = "HALLMARK_DNA_REPAIR", name = "DNA_repair_score")
## Warning: The following features are not present in the object: ENSG00000122971, ENSG00000182035, ENSG00000181092, ENSG00000285070, ENSG00000168497, ENSG00000123080, ENSG00000242131,
## ENSG00000273607, ENSG00000176194, ENSG00000142973, ENSG00000275003, ENSG00000261698, ENSG00000285482, ENSG00000282853, ENSG00000170323, ENSG00000211445, ENSG00000152137,
## ENSG00000174697, ENSG00000283887, ENSG00000285121, ENSG00000285387, ENSG00000229314, ENSG00000115255, ENSG00000104918, ENSG00000183048, ENSG00000173267, ENSG00000198142,
## ENSG00000137767, ENSG00000175567, ENSG00000087085, ENSG00000273686, ENSG00000204364, ENSG00000206372, ENSG00000226560, ENSG00000231543, ENSG00000235017, ENSG00000235696,
## ENSG00000042493, ENSG00000164326, ENSG00000172156, ENSG00000181374, ENSG00000102962, ENSG00000275302, ENSG00000275824, ENSG00000277943, ENSG00000271503, ENSG00000274233,
## ENSG00000108688, ENSG00000163823, ENSG00000121807, ENSG00000160791, ENSG00000116824, ENSG00000178562, ENSG00000167286, ENSG00000198851, ENSG00000160654, ENSG00000101017,
## ENSG00000102245, ENSG00000173762, ENSG00000105369, ENSG00000121594, ENSG00000153563, ENSG00000172116, ENSG00000147889, ENSG00000126759, ENSG00000109943, ENSG00000138755,
## ENSG00000186810, ENSG00000288145, ENSG00000197561, ENSG00000277571, ENSG00000102034, ENSG00000124882, ENSG00000180210, ENSG00000117560, ENSG00000072694, ENSG00000140030,
## ENSG00000145649, ENSG00000100453, ENSG00000206505, ENSG00000223980, ENSG00000224320, ENSG00000227715, ENSG00000229215, ENSG00000231834, ENSG00000235657, ENSG00000239463,
## ENSG00000241394, ENSG00000242361, ENSG00000242685, ENSG00000243189, ENSG00000243215, ENSG00000243719, ENSG00000226264, ENSG00000234154, ENSG00000239329, ENSG00000241296,
## ENSG00000241674, ENSG00000242092, ENSG00000242386, ENSG00000206292, ENSG00000230141, ENSG00000231558, ENSG00000232957, ENSG00000232962, ENSG00000235744, ENSG00000239457,
## ENSG00000241106, ENSG00000241386, ENSG00000241910, ENSG00000243496, ENSG00000243612, ENSG00000188296, ENSG00000206305, ENSG00000225890, ENSG00000228284, ENSG00000232062,
## ENSG00000236418, ENSG00000206308, ENSG00000226260, ENSG00000227993, ENSG00000228987, ENSG00000230726, ENSG00000234794, ENSG00000277263, ENSG00000206493, ENSG00000225201,
## ENSG00000229252, ENSG00000230254, ENSG00000233904, ENSG00000236632, ENSG00000204632, ENSG00000206506, ENSG00000230413, ENSG00000233095, ENSG00000235346, ENSG00000235680,
## ENSG00000237216, ENSG00000276051, ENSG00000090339, ENSG00000111537, ENSG00000262795, ENSG00000283934, ENSG00000136634, ENSG00000095752, ENSG00000168811, ENSG00000113302,
## ENSG00000096996, ENSG00000169194, ENSG00000115607, ENSG00000125538, ENSG00000109471, ENSG00000288185, ENSG00000134460, ENSG00000100385, ENSG00000147168, ENSG00000113520,
## ENSG00000136244, ENSG00000145839, ENSG00000163083, ENSG00000137265, ENSG00000276561, ENSG00000113263, ENSG00000167768, ENSG00000182866, ENSG00000128342, ENSG00000204487,
## ENSG00000206437, ENSG00000223448, ENSG00000227507, ENSG00000231314, ENSG00000236237, ENSG00000236925, ENSG00000238114, ENSG00000054219, ENSG00000104814, ENSG00000282928,
## ENSG00000165471, ENSG00000100985, ENSG00000100365, ENSG00000275990, ENSG00000189430, ENSG00000273506, ENSG00000273535, ENSG00000273916, ENSG00000274053, ENSG00000275156,
## ENSG00000275521, ENSG00000275637, ENSG00000275822, ENSG00000276450, ENSG00000277334, ENSG00000277442, ENSG00000277629, ENSG00000277824, ENSG00000278025, ENSG00000278362,
## ENSG00000284113, ENSG00000288651, ENSG00000162711, ENSG00000007171, ENSG00000163737, ENSG00000180644, ENSG00000126583, ENSG00000262418, ENSG00000140986, ENSG00000274005,
## ENSG00000274626, ENSG00000274646, ENSG00000274950, ENSG00000275323, ENSG00000277079, ENSG00000277359, ENSG00000278081, ENSG00000278270, ENSG00000137078, ENSG00000185338,
## ENSG00000206297, ENSG00000224212, ENSG00000224748, ENSG00000226173, ENSG00000227816, ENSG00000230705, ENSG00000232367, ENSG00000206235, ENSG00000206299, ENSG00000223481,
## ENSG00000225967, ENSG00000228582, ENSG00000232326, ENSG00000237599, ENSG00000112493, ENSG00000206208, ENSG00000206281, ENSG00000236490, ENSG00000174130, ENSG00000204490,
## ENSG00000206439, ENSG00000223952, ENSG00000228321, ENSG00000228849, ENSG00000228978, ENSG00000230108, ENSG00000232810, ENSG00000163519, ENSG00000115085, ENSG00000160862,
## ENSG00000186480, ENSG00000167751, ENSG00000142515, ENSG00000167034, ENSG00000124664, ENSG00000091583, ENSG00000124875, ENSG00000163132, ENSG00000008438, ENSG00000186652,
## ENSG00000196154, ENSG00000188488, ENSG00000178726, ENSG00000109072, ENSG00000143632, ENSG00000159251, ENSG00000163017, ENSG00000248746, ENSG00000282844, ENSG00000282230,
## ENSG00000275199, ENSG00000179593, ENSG00000181754, ENSG00000139211, ENSG00000105409, ENSG00000162706, ENSG00000172137, ENSG00000282830, ENSG00000090659, ENSG00000280641,
## ENSG00000137197, ENSG00000204539, ENSG00000206460, ENSG00000237114, ENSG00000237123, ENSG00000237165, ENSG00000184113, ENSG00000184697, ENSG00000288292, ENSG00000156284,
## ENSG00000213937, ENSG00000065618, ENSG00000130545, ENSG00000206486, ENSG00000226171, ENSG00000227222, ENSG00000231377, ENSG00000233049, ENSG00000233418, ENSG00000233561,
## ENSG00000152592, ENSG00000134765, ENSG00000134762, ENSG00000128591, ENSG00000276536, ENSG00000108622, ENSG00000105371, ENSG00000105376, ENSG00000094796, ENSG00000262993,
## ENSG00000196878, ENSG00000143217, ENSG00000283859, ENSG00000197893, ENSG00000206315, ENSG00000224952, ENSG00000225987, ENSG00000232005, ENSG00000236353, ENSG00000237344,
## ENSG00000156453, ENSG00000172367, ENSG00000284792, ENSG00000128340, ENSG00000126458, ENSG00000277401, ENSG00000126562, ENSG00000285443, ENSG00000107485, ENSG00000124466,
## ENSG00000162631, ENSG00000169174, ENSG00000140519, ENSG00000185924, ENSG00000197891, ENSG00000288174, ENSG00000198569, ENSG00000183773, ENSG00000166148, ENSG00000137875,
## ENSG00000125845, ENSG00000281484, ENSG00000137752, ENSG00000105974, ENSG00000133101, ENSG00000170458, ENSG00000110848, ENSG00000124762, ENSG00000179388, ENSG00000099860,
## ENSG00000060558, ENSG00000132518, ENSG00000100292, ENSG00000206377, ENSG00000206478, ENSG00000227231, ENSG00000230128, ENSG00000235030, ENSG00000237155, ENSG00000171855,
## ENSG00000167779, ENSG00000172183, ENSG00000100285, ENSG00000117600, ENSG00000141682, ENSG00000284099, ENSG00000172482, ENSG00000118137, ENSG00000103569, ENSG00000138135,
## ENSG00000167910, ENSG00000145321, ENSG00000124713, ENSG00000101323, ENSG00000025423, ENSG00000203857, ENSG00000099377, ENSG00000105610, ENSG00000131910, ENSG00000144852,
## ENSG00000143171, ENSG00000170099, ENSG00000277405, ENSG00000276130, ENSG00000167780, ENSG00000088002, ENSG00000118271, ENSG00000109107, ENSG00000161921, ENSG00000284967,
## ENSG00000281289, ENSG00000185813, ENSG00000130208, ENSG00000234906, ENSG00000110245, ENSG00000173372, ENSG00000288512, ENSG00000157131, ENSG00000021852, ENSG00000176919,
## ASMPATCHG00000001303, ENSG00000262875, ENSG00000204359, ENSG00000239754, ENSG00000241253, ENSG00000241534, ENSG00000242335, ENSG00000243570, ENSG00000197766, ENSG00000274619,
## ENSG00000105664, ENSG00000120054, ENSG00000285132, ENSG00000196188, ENSG00000263238, ENSG00000080166, ENSG00000275932, ENSG00000088926, ENSG00000131187, ENSG00000143278,
## ENSG00000164220, ENSG00000101981, ENSG00000171560, ENSG00000171557, ENSG00000185245, ENSG00000169704, ENSG00000134240, ENSG00000113905, ENSG00000055957, ENSG00000143768,
## ENSG00000112818, ENSG00000189374, ENSG00000196611, ENSG00000166670, ENSG00000275365, ENSG00000102996, ENSG00000149968, ENSG00000118113, ENSG00000169860, ENSG00000100311,
## ENSG00000115956, ENSG00000126231, ENSG00000183155, ENSG00000160678, ENSG00000197249, ENSG00000277377, ENSG00000197632, ENSG00000117601, ENSG00000283100, ENSG00000105825,
## ENSG00000187045, ENSG00000274286, ENSG00000110244, ENSG00000159189, ENSG00000123843, ENSG00000137757, ENSG00000158825, ENSG00000176749, ENSG00000172216, ENSG00000117322,
## ENSG00000163739, ENSG00000138166, ENSG00000110047, ENSG00000057593, ENSG00000158869, ENSG00000085265, ENSG00000167083, ENSG00000113088, ENSG00000116983, ENSG00000215328,
## ENSG00000234475, ENSG00000235941, ENSG00000237724, ENSG00000167748, ENSG00000012223, ENSG00000262406, ENSG00000137745, ENSG00000262325, ENSG00000198736, ENSG00000087250,
## ENSG00000206312, ENSG0
visualize_metadata(srat, meta = "DNA_repair_score1", group.by = "seurat_clusters" )

Note: Chemo-treated samples should have higher DNA-damage scores.

DROSHA target genes

srat <- MSigDB_score(object = srat , category = "C3", gs_name = "DROSHA_TARGET_GENES", name = "DROSHA_score")
## Warning: The following features are not present in the object: ENSG00000282844, ENSG00000282230, ENSG00000165566, ENSG00000285479, ENSG00000245848, ENSG00000092345, ENSG00000067048,
## ENSG00000110047, ENSG00000156466, ENSG00000184408, ENSG00000171017, ENSG00000168490, ENSG00000164093, ENSG00000274089, ENSG00000134207, ENSG00000206203, ENSG00000125084,
## ENSG00000159640, ENSG00000110200, ENSG00000167117, ENSG00000086159, ENSG00000144119, ENSG00000261893, ENSG00000277354, ENSG00000134765, ENSG00000281320, ENSG00000196361,
## ENSG00000183312, ENSG00000188719, ENSG00000136750, ENSG00000131095, ENSG00000138622, ENSG00000106031, ENSG00000180509, ENSG00000138379, ENSG00000278372, ENSG00000164600,
## ENSG00000177551, ENSG00000122584, ENSG00000119547, ENSG00000179270, ENSG00000138650, ENSG00000151615, ENSG00000198208, ENSG00000177098, ENSG00000276371, ENSG00000282739,
## ENSG00000178235, ENSG00000167941, ENSG00000164458, ENSG00000173452, ENSG00000288130, ENSG00000002745, ENSG00000166188, ENSG00000275176, ENSG00000141433, ENSG00000285070,
## ENSG00000134812, ENSG00000163081, ENSG00000077279, ENSG00000197921, ENSG00000273529, ENSG00000116990, ENSG00000270885, ENSG00000285281, ENSG00000165300, ENSG00000011347,
## ENSG00000130598, ENSG00000288219, ENSG00000206495, ENSG00000224994, ENSG00000226437, ENSG00000229929, ENSG00000230308, ENSG00000232839, ENSG00000174963, ENSG00000162551,
## ENSG00000280759, ENSG00000281385, ENSG00000008300, ENSG00000116254, ENSG00000288307, ENSG00000139549, ENSG00000274429, ENSG00000275482, ENSG00000152822, ENSG00000154118,
## ENSG00000177272, ENSG00000144063, ENSG00000181965, ENSG00000171246, ENSG00000184486, ENSG00000215397, ENSG00000104888, ENSG00000170855, ENSG00000275650, ENSG00000276537,
## ENSG00000276887, ENSG00000179774, ENSG00000185742, ENSG00000006116, ENSG00000170458, ENSG00000273622, ENSG00000273633, ENSG00000278444, ENSG00000284796, ENSG00000284227,
## ENSG00000137094, ENSG00000157851, ASMPATCHG00000000675, ENSG00000262102, ENSG00000102034, ENSG00000126882, ENSG00000132446, ENSG00000107485, ENSG00000188394, ENSG00000158055,
## ENSG00000165478, ENSG00000152804, ENSG00000266265, ENSG00000183640, ENSG00000178457, ENSG00000275851, ENSG00000149968, ENSG00000203624, ENSG00000223775, ENSG00000226111,
## ENSG00000227420, ENSG00000229861, ENSG00000233813, ENSG00000125414, ENSG00000130558, ENSG00000197244, ENSG00000205927, ENSG00000165588, ENSG00000125813, ENSG00000007372,
## ENSG00000175426, ENSG00000109132, ENSG00000152192, ENSG00000206489, ENSG00000227804, ENSG00000230995, ENSG00000231737, ENSG00000235291, ENSG00000238104, ENSG00000263310,
## ENSG00000277015, ENSG00000130766, ENSG00000285069, ENSG00000164438, ENSG00000184730, ENSG00000167580, ENSG00000004848, ENSG00000005981, ENSG00000018625, ENSG00000285390,
## ENSG00000174672, ENSG00000170279, ENSG00000118729, ENSG00000147869, ENSG00000168539, ENSG00000184113, ENSG00000147003, ENSG00000196167, ENSG00000147571, ENSG00000118231,
## ENSG00000285434, ENSG00000206395, ENSG00000225635, ENSG00000226634, ENSG00000227317, ENSG00000228128, ENSG00000233076, ENSG00000152670, ENSG00000284807, ENSG00000137090,
## ENSG00000227802, ENSG00000179813, ENSG00000133477, ENSG00000179639, ENSG00000129514, ENSG00000176165, ENSG00000184481, ENSG00000134363, ENSG00000162676, ENSG00000275099,
## ENSG00000189433, ENSG00000186417, ENSG00000164604, ENSG00000111291, ENSG00000099377, ENSG00000105371, ENSG00000175189, ENSG00000237941, ENSG00000278855, ENSG00000257702,
## ENSG00000273574, ENSG00000274495, ENSG00000278312, ENSG00000143355, ENSG00000176659, ENSG00000177363, ENSG00000283780, ENSG00000278268, ENSG00000111046, ENSG00000173376,
## ENSG00000117650, ENSG00000285485, ENSG00000182379, ENSG00000196071, ENSG00000100311, ENSG00000225553, ENSG00000239756, ENSG00000276358, ENSG00000277111, ENSG00000102007,
## ENSG00000183395, ENSG00000168967, ENSG00000110777, ENSG00000126583, ENSG00000163421, ENSG00000152292, ENSG00000225697, ENSG00000004939, ENSG00000125285, ENSG00000113739,
## ENSG00000179002, ENSG00000236876, ENSG00000230043, ENSG00000187653, ENSG00000206258, ENSG00000229341, ENSG00000229353, ENSG00000231608, ENSG00000233323, ENSG00000236221,
## ENSG00000236236, ENSG00000161911, ENSG00000184108, ENSG00000235217, ENSG00000146469, ENSG00000152977, ENSG00000170684, ENSG00000171443, ENSG00000143373, ENSG00000172238,
## ENSG00000275544, ENSG00000129226, ENSG00000226492, ENSG00000122877, ENSG00000153266, ENSG00000125740, ENSG00000125798, ENSG00000257008, ENSG00000134317, ENSG00000133937,
## ENSG00000169840, ENSG00000123407, ENSG00000126803, ENSG00000135312, ENSG00000173404, ENSG00000137265, ENSG00000133124, ENSG00000173826, ENSG00000182591, ENSG00000186860,
## ENSG00000263341, ENSG00000108231, ENSG00000181541, ENSG00000168530, ASMPATCHG00000001304, ENSG00000182950, ENSG00000078589, ENSG00000188582, ENSG00000231989, ENSG00000265203,
## ENSG00000168476, ENSG00000196717, ENSG00000198377, ENSG00000206282, ENSG00000224841, ENSG00000228736, ENSG00000237825, ENSG00000243978, ENSG00000104112, ENSG00000185985,
## ENSG00000260873, ENSG00000182968, ENSG00000134595, ENSG00000163071, ENSG00000035720, ENSG00000124659, ENSG00000277401, ENSG00000230359, ENSG00000274334, ENSG00000180305,
## ENSG00000180205, ENSG00000156925, ENSG00000163492, ENSG00000118402, ENSG00000181656, ENSG00000016082, ENSG00000178695, ENSG00000166963, ENSG00000122691, ENSG00000279389,
## ENSG00000282735, ENSG00000203784, ENSG00000162711, ENSG00000112299, ENSG00000276016, ENSG00000278741, ENSG00000136881, ENSG00000276559, ENSG00000188848, ENSG00000186897,
## ENSG00000153923, ENSG00000006377, ENSG00000143590, ENSG00000185182, ENSG00000278429, ENSG00000153684, ENSG00000276896, ENSG00000278310, ENSG00000183629, ENSG00000273651,
## ENSG00000277090, ENSG00000163083, ENSG00000274305, ENSG00000276458, ENSG00000276681, ENSG00000274780, ENSG00000284017, ENSG00000285152, ENSG00000169218, ENSG00000263290,
## ENSG00000072041, ENSG00000184564, ENSG00000275272, ENSG00000157766, ENSG00000144476, ENSG00000283802, ENSG00000153294, ENSG00000148926, ENSG00000169252, ENSG00000275199,
## ENSG00000139211, ENSG00000224309, ENSG00000276497, ENSG00000147256, ENSG00000164122, ENSG00000198049, ENSG00000143032, ENSG00000181004, ENSG00000140379, ENSG00000180828,
## ENSG00000112175, ENSG00000102239, ENSG00000118903, ENSG00000100314, ENSG00000102001, ENSG00000166862, ENSG00000104327, ENSG00000172137, ENSG00000282830, ENSG00000070808,
## ENSG00000164076, ENSG00000077274, ENSG00000164326, ENSG00000168497, ENSG00000054803, ENSG00000253276, ENSG00000163823, ENSG00000102245, ENSG00000110848, ENSG00000179776,
## ENSG00000113100, ENSG00000164885, ENSG00000273777, ENSG00000159409, ENSG00000244414, ENSG00000080910, ENSG00000116785, ENSG00000134389, ENSG00000147434, ENSG00000114737,
## ENSG00000156284, ENSG00000165682, ENSG00000180745, ENSG00000242689, ENSG00000278728, ENSG00000284369, ENSG00000241563, ENSG00000158516, ENSG00000130545, ENSG00000132693,
## ENSG00000163254, ENSG00000285011, ENSG00000108342, ENSG00000180138, ENSG00000144655, ENSG00000095596, ENSG00000003137, ENSG00000164935, ENSG00000280873, ENSG00000185559,
## ENSG00000275555, ENSG00000144355, ENSG00000173253, ENSG00000064218, ENSG00000183784, ENSG00000152591, ENSG00000284103, ENSG00000221818, ENSG00000229715, ENSG00000138798,
## ENSG00000179388, ENSG00000198692, ENSG00000107105, ENSG00000168913, ENSG00000204117, ENSG00000163508, ENSG00000049283, ENSG00000083782, ENSG00000106038, ENSG00000164251,
## ENSG00000170323, ENSG00000197245, ENSG00000116661, ENSG00000196468, ENSG00000268853, ENSG00000118972, ENSG00000175592, ENSG00000171956, ENSG00000187140, ENSG00000049768,
## ENSG00000269002, ENSG00000022355, ENSG00000163285, ENSG00000145321, ENSG00000115263, ENSG00000282972, ENSG00000179600, ENSG00000172209, ENSG00000283812, ENSG00000170775,
## ENSG00000102195, ENSG00000203737, ENSG00000156097, ENSG00000140030, ENSG00000126010, ENSG00000101323, ENSG00000145681, ENSG00000283847, ENSG00000137252, ENSG00000179111,
## ENSG00000136630, ENSG00000135100, ENSG00000282958, ENSG00000105991, ENSG00000172789, ENSG00000101180, ENSG00000135116, ENSG00000283060, ENSG00000171855, ENSG00000283934,
## ENSG00000136634, ENSG00000095752, ENSG00000138684, ENSG00000145839, ENSG00000283510, ENSG00000285248, ENSG00000170604, ENSG00000113430, ENSG00000176842, ENSG00000159387,
## ENSG00000165898,
visualize_metadata(srat, meta = "DROSHA_score1", group.by = "seurat_clusters" )

DICER1 target genes

srat <- MSigDB_score(object = srat , category = "C3", gs_name = "DICER1_TARGET_GENES", name = "DICER1_score")
## Warning: The following features are not present in the object: ENSG00000282844, ENSG00000282230, ENSG00000165566, ENSG00000285479, ENSG00000245848, ENSG00000092345, ENSG00000067048,
## ENSG00000110047, ENSG00000156466, ENSG00000184408, ENSG00000171017, ENSG00000168490, ENSG00000164093, ENSG00000274089, ENSG00000134207, ENSG00000206203, ENSG00000125084,
## ENSG00000159640, ENSG00000110200, ENSG00000167117, ENSG00000086159, ENSG00000144119, ENSG00000261893, ENSG00000277354, ENSG00000134765, ENSG00000281320, ENSG00000196361,
## ENSG00000183312, ENSG00000188719, ENSG00000136750, ENSG00000131095, ENSG00000138622, ENSG00000106031, ENSG00000180509, ENSG00000138379, ENSG00000278372, ENSG00000164600,
## ENSG00000177551, ENSG00000122584, ENSG00000119547, ENSG00000179270, ENSG00000138650, ENSG00000151615, ENSG00000198208, ENSG00000177098, ENSG00000276371, ENSG00000282739,
## ENSG00000178235, ENSG00000167941, ENSG00000164458, ENSG00000173452, ENSG00000288130, ENSG00000002745, ENSG00000166188, ENSG00000275176, ENSG00000141433, ENSG00000285070,
## ENSG00000134812, ENSG00000163081, ENSG00000077279, ENSG00000197921, ENSG00000273529, ENSG00000116990, ENSG00000270885, ENSG00000285281, ENSG00000165300, ENSG00000011347,
## ENSG00000130598, ENSG00000288219, ENSG00000206495, ENSG00000224994, ENSG00000226437, ENSG00000229929, ENSG00000230308, ENSG00000232839, ENSG00000174963, ENSG00000162551,
## ENSG00000280759, ENSG00000281385, ENSG00000008300, ENSG00000116254, ENSG00000288307, ENSG00000139549, ENSG00000274429, ENSG00000275482, ENSG00000152822, ENSG00000154118,
## ENSG00000177272, ENSG00000144063, ENSG00000181965, ENSG00000171246, ENSG00000184486, ENSG00000215397, ENSG00000104888, ENSG00000170855, ENSG00000275650, ENSG00000276537,
## ENSG00000276887, ENSG00000179774, ENSG00000185742, ENSG00000006116, ENSG00000170458, ENSG00000273622, ENSG00000273633, ENSG00000278444, ENSG00000284796, ENSG00000284227,
## ENSG00000137094, ENSG00000157851, ASMPATCHG00000000675, ENSG00000262102, ENSG00000102034, ENSG00000126882, ENSG00000132446, ENSG00000107485, ENSG00000188394, ENSG00000158055,
## ENSG00000165478, ENSG00000152804, ENSG00000266265, ENSG00000183640, ENSG00000178457, ENSG00000275851, ENSG00000149968, ENSG00000203624, ENSG00000223775, ENSG00000226111,
## ENSG00000227420, ENSG00000229861, ENSG00000233813, ENSG00000125414, ENSG00000130558, ENSG00000197244, ENSG00000205927, ENSG00000165588, ENSG00000125813, ENSG00000007372,
## ENSG00000175426, ENSG00000109132, ENSG00000152192, ENSG00000206489, ENSG00000227804, ENSG00000230995, ENSG00000231737, ENSG00000235291, ENSG00000238104, ENSG00000263310,
## ENSG00000277015, ENSG00000130766, ENSG00000285069, ENSG00000164438, ENSG00000184730, ENSG00000167580, ENSG00000004848, ENSG00000005981, ENSG00000018625, ENSG00000285390,
## ENSG00000174672, ENSG00000170279, ENSG00000118729, ENSG00000147869, ENSG00000168539, ENSG00000184113, ENSG00000147003, ENSG00000196167, ENSG00000147571, ENSG00000118231,
## ENSG00000285434, ENSG00000206395, ENSG00000225635, ENSG00000226634, ENSG00000227317, ENSG00000228128, ENSG00000233076, ENSG00000152670, ENSG00000284807, ENSG00000137090,
## ENSG00000227802, ENSG00000179813, ENSG00000133477, ENSG00000179639, ENSG00000129514, ENSG00000176165, ENSG00000184481, ENSG00000134363, ENSG00000162676, ENSG00000275099,
## ENSG00000189433, ENSG00000186417, ENSG00000164604, ENSG00000111291, ENSG00000099377, ENSG00000105371, ENSG00000175189, ENSG00000237941, ENSG00000278855, ENSG00000257702,
## ENSG00000273574, ENSG00000274495, ENSG00000278312, ENSG00000143355, ENSG00000176659, ENSG00000177363, ENSG00000283780, ENSG00000278268, ENSG00000111046, ENSG00000173376,
## ENSG00000117650, ENSG00000285485, ENSG00000182379, ENSG00000196071, ENSG00000100311, ENSG00000225553, ENSG00000239756, ENSG00000276358, ENSG00000277111, ENSG00000102007,
## ENSG00000183395, ENSG00000168967, ENSG00000110777, ENSG00000126583, ENSG00000163421, ENSG00000152292, ENSG00000225697, ENSG00000004939, ENSG00000125285, ENSG00000113739,
## ENSG00000179002, ENSG00000236876, ENSG00000230043, ENSG00000187653, ENSG00000206258, ENSG00000229341, ENSG00000229353, ENSG00000231608, ENSG00000233323, ENSG00000236221,
## ENSG00000236236, ENSG00000161911, ENSG00000184108, ENSG00000235217, ENSG00000146469, ENSG00000152977, ENSG00000170684, ENSG00000171443, ENSG00000143373, ENSG00000172238,
## ENSG00000275544, ENSG00000129226, ENSG00000226492, ENSG00000122877, ENSG00000153266, ENSG00000125740, ENSG00000125798, ENSG00000257008, ENSG00000134317, ENSG00000133937,
## ENSG00000169840, ENSG00000123407, ENSG00000126803, ENSG00000135312, ENSG00000173404, ENSG00000137265, ENSG00000133124, ENSG00000173826, ENSG00000182591, ENSG00000186860,
## ENSG00000263341, ENSG00000108231, ENSG00000181541, ENSG00000168530, ASMPATCHG00000001304, ENSG00000182950, ENSG00000078589, ENSG00000188582, ENSG00000231989, ENSG00000265203,
## ENSG00000168476, ENSG00000196717, ENSG00000198377, ENSG00000206282, ENSG00000224841, ENSG00000228736, ENSG00000237825, ENSG00000243978, ENSG00000104112, ENSG00000185985,
## ENSG00000260873, ENSG00000182968, ENSG00000134595, ENSG00000163071, ENSG00000035720, ENSG00000124659, ENSG00000277401, ENSG00000230359, ENSG00000274334, ENSG00000180305,
## ENSG00000180205, ENSG00000156925, ENSG00000163492, ENSG00000118402, ENSG00000181656, ENSG00000016082, ENSG00000178695, ENSG00000166963, ENSG00000122691, ENSG00000279389,
## ENSG00000282735, ENSG00000203784, ENSG00000162711, ENSG00000112299, ENSG00000276016, ENSG00000278741, ENSG00000136881, ENSG00000276559, ENSG00000188848, ENSG00000186897,
## ENSG00000153923, ENSG00000006377, ENSG00000143590, ENSG00000185182, ENSG00000278429, ENSG00000153684, ENSG00000276896, ENSG00000278310, ENSG00000183629, ENSG00000273651,
## ENSG00000277090, ENSG00000163083, ENSG00000274305, ENSG00000276458, ENSG00000276681, ENSG00000274780, ENSG00000284017, ENSG00000285152, ENSG00000169218, ENSG00000263290,
## ENSG00000072041, ENSG00000184564, ENSG00000275272, ENSG00000157766, ENSG00000144476, ENSG00000283802, ENSG00000153294, ENSG00000148926, ENSG00000169252, ENSG00000275199,
## ENSG00000139211, ENSG00000224309, ENSG00000276497, ENSG00000147256, ENSG00000164122, ENSG00000198049, ENSG00000143032, ENSG00000181004, ENSG00000140379, ENSG00000180828,
## ENSG00000112175, ENSG00000102239, ENSG00000118903, ENSG00000100314, ENSG00000102001, ENSG00000166862, ENSG00000104327, ENSG00000172137, ENSG00000282830, ENSG00000070808,
## ENSG00000164076, ENSG00000077274, ENSG00000164326, ENSG00000168497, ENSG00000054803, ENSG00000253276, ENSG00000163823, ENSG00000102245, ENSG00000110848, ENSG00000179776,
## ENSG00000113100, ENSG00000164885, ENSG00000273777, ENSG00000159409, ENSG00000244414, ENSG00000080910, ENSG00000116785, ENSG00000134389, ENSG00000147434, ENSG00000114737,
## ENSG00000156284, ENSG00000165682, ENSG00000180745, ENSG00000242689, ENSG00000278728, ENSG00000284369, ENSG00000241563, ENSG00000158516, ENSG00000130545, ENSG00000132693,
## ENSG00000163254, ENSG00000285011, ENSG00000108342, ENSG00000180138, ENSG00000144655, ENSG00000095596, ENSG00000003137, ENSG00000164935, ENSG00000280873, ENSG00000185559,
## ENSG00000275555, ENSG00000144355, ENSG00000173253, ENSG00000064218, ENSG00000183784, ENSG00000152591, ENSG00000284103, ENSG00000221818, ENSG00000229715, ENSG00000138798,
## ENSG00000179388, ENSG00000198692, ENSG00000107105, ENSG00000168913, ENSG00000204117, ENSG00000163508, ENSG00000049283, ENSG00000083782, ENSG00000106038, ENSG00000164251,
## ENSG00000170323, ENSG00000197245, ENSG00000116661, ENSG00000196468, ENSG00000268853, ENSG00000118972, ENSG00000175592, ENSG00000171956, ENSG00000187140, ENSG00000049768,
## ENSG00000269002, ENSG00000022355, ENSG00000163285, ENSG00000145321, ENSG00000115263, ENSG00000282972, ENSG00000179600, ENSG00000172209, ENSG00000283812, ENSG00000170775,
## ENSG00000102195, ENSG00000203737, ENSG00000156097, ENSG00000140030, ENSG00000126010, ENSG00000101323, ENSG00000145681, ENSG00000283847, ENSG00000137252, ENSG00000179111,
## ENSG00000136630, ENSG00000135100, ENSG00000282958, ENSG00000105991, ENSG00000172789, ENSG00000101180, ENSG00000135116, ENSG00000283060, ENSG00000171855, ENSG00000283934,
## ENSG00000136634, ENSG00000095752, ENSG00000138684, ENSG00000145839, ENSG00000283510, ENSG00000285248, ENSG00000170604, ENSG00000113430, ENSG00000176842, ENSG00000159387,
## ENSG00000165898,
visualize_metadata(srat, meta = "DICER1_score1", group.by = "seurat_clusters" )
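
The warnings above list many gene set members that are absent from the object, so it can be worth checking how much of a gene set is actually present before interpreting a score. The following is only a minimal base R sketch: `object_genes` and `gene_set` are toy vectors standing in for the feature names of the Seurat object and the Ensembl IDs of an MSigDB gene set, and `gene_set_coverage` is a hypothetical helper that is not part of this notebook.

```r
# Toy stand-ins (assumptions): in the notebook these would be the feature
# names of `srat` and the Ensembl IDs of the gene set being scored.
object_genes <- c("toy_gene_1", "toy_gene_2", "toy_gene_3", "toy_gene_4")
gene_set     <- c("toy_gene_2", "toy_gene_4", "toy_gene_9", "toy_gene_10")

# Hypothetical helper: summarise how much of a gene set is found in the object.
gene_set_coverage <- function(gene_set, object_genes) {
  gene_set <- unique(gene_set)
  present  <- intersect(gene_set, object_genes)
  list(
    n_set     = length(gene_set),                   # size of the gene set
    n_present = length(present),                    # members found in the object
    fraction  = length(present) / length(gene_set), # share of the set that can be scored
    missing   = setdiff(gene_set, object_genes)     # members that trigger the warning
  )
}

coverage <- gene_set_coverage(gene_set, object_genes)
coverage$fraction
```

A score computed from a small fraction of the intended genes should probably be read with more caution than one based on most of the set.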

Find marker genes for each of the Seurat clusters

In addition to the list of known marker genes, we used an unbiased approach to find transcripts that characterize the different clusters. We run DElegate::FindAllMarkers2 to find markers of the different clusters and manually check whether they make sense. DElegate::FindAllMarkers2 is an improved version of Seurat::FindAllMarkers based on a pseudobulk differential expression method. Please check the preprint from Christoph Hafemeister: https://www.biorxiv.org/content/10.1101/2023.03.28.534443v1 and the tool described here: https://github.com/cancerbits/DElegate

Of note, we will not use these markers for annotation; they are only here to give a first idea of what characterizes each cluster.

feature_conversion <- srat@assays$RNA@meta.data
de_results   <- DElegate::FindAllMarkers2(srat, group_column = "seurat_clusters")

# filter the most relevant markers using the thresholds defined in params
s.markers <- de_results[de_results$padj < params$padj_thershold & de_results$log_fc > params$lfc_threshold & de_results$rate1 > params$rate1_threshold,]

# add gene symbols for easier interpretation of the results
s.markers$gene_ids <- s.markers$feature
s.markers <- left_join(s.markers,feature_conversion, by = c( "gene_ids") )
identical(s.markers$feature, s.markers$gene_ids)     # sanity check after the merge, must be TRUE
## [1] TRUE
DT::datatable(s.markers, caption = ("marker genes"), 
              extensions = 'Buttons', 
              options = list(  dom = 'Bfrtip',

                               buttons = c( 'csv', 'excel')))
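
The identical() check above compares gene_ids with the column it was copied from, so it appears to be TRUE by construction. A stricter check would verify that the annotation step neither adds rows nor leaves markers without a symbol. The following is a minimal base R sketch on toy data; the `gene_symbol` column and all values are assumptions made for illustration and do not come from the real conversion table.

```r
# Toy marker table and conversion table (all values are invented).
markers <- data.frame(
  feature = c("g1", "g2", "g3"),
  log_fc  = c(2.1, 1.4, 3.0)
)
conversion <- data.frame(
  gene_ids    = c("g1", "g2", "g3", "g4"),
  gene_symbol = c("A", "B", "C", "D")   # hypothetical column name
)

# Duplicated keys in the conversion table would duplicate marker rows in a join.
stopifnot(anyDuplicated(conversion$gene_ids) == 0)

# Look up each marker in the conversion table without changing the row order.
idx <- match(markers$feature, conversion$gene_ids)
annotated <- markers
annotated$gene_symbol <- conversion$gene_symbol[idx]

# Checks that can actually fail: row count preserved, no unmatched markers.
stopifnot(nrow(annotated) == nrow(markers))
n_unmatched <- sum(is.na(annotated$gene_symbol))
n_unmatched
```
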
# Select top 5 genes for heatmap plotting
s.markers <- na.omit(s.markers)
s.markers %>%
    group_by(group1) %>%
    top_n(n =  5, wt = log_fc) -> top5

# subset for plotting
cells <- WhichCells(srat, downsample = 100)
ss <- subset(srat, cells = cells)
ss <- ScaleData(ss, features = top5$feature)

p1 <- SCpubr::do_DimPlot(srat, reduction="umap", group.by = "seurat_clusters", label = TRUE, repel = TRUE) + ggtitle("Seurat Cluster - umap")
p2 <- DoHeatmap(ss, features = top5$feature,  cells = cells, group.by = "seurat_clusters") + NoLegend() + 
  scale_fill_gradientn(colors =  c("#01665e","#35978f",'darkslategray3', "#f7f7f7", "#fee391","#fec44f","#F9AD03")) 
p3 <- ggplot(srat@meta.data, aes(seurat_clusters, fill = seurat_clusters)) + geom_bar() + NoLegend()


common_title <- sprintf("Unsupervised clustering %s, %d cells", srat@meta.data$orig.ident[1], ncol(srat))
show((((p1 / p3) + plot_layout(heights = c(3,2)) | p2) ) + plot_layout(widths = c(1, 2)) + plot_layout(heights = c(3,1)) + plot_annotation(title = common_title))

DT::datatable(top5[, c(1, 9, 11, 12)], caption = ("top 5 marker genes"), 
              extensions = 'Buttons', 
              options = list(  dom = 'Bfrtip',

                               buttons = c( 'csv', 'excel')))
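
The top 5 selection keeps, for each cluster, the markers with the largest log fold change. Note that dplyr::top_n keeps ties, so a cluster can end up with more than 5 rows. The following base R sketch returns exactly n rows per group instead; the column names group1, feature and log_fc follow the marker table above, while the values and the `top_n_per_group` helper are invented for illustration.

```r
# Toy marker table (values are invented).
toy <- data.frame(
  group1  = c("0", "0", "0", "1", "1", "1"),
  feature = c("g1", "g2", "g3", "g4", "g5", "g6"),
  log_fc  = c(1.2, 3.4, 2.0, 0.9, 2.5, 4.1)
)

# Hypothetical helper: keep the n rows with the highest log_fc in each group.
top_n_per_group <- function(df, n) {
  pieces <- lapply(split(df, df$group1), function(d) {
    d <- d[order(d$log_fc, decreasing = TRUE), , drop = FALSE]
    head(d, n)  # exactly n rows (or fewer if the group is smaller)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

top2 <- top_n_per_group(toy, n = 2)
top2
```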

Enrichment analysis of marker genes

Here we perform enrichment analysis of the marker genes found in the previous section for each Seurat cluster.

We defined the universe/background as all the genes present in the dataset, meaning the rownames of the Seurat object.

We used three gene set collections from MSigDB:

  • the hallmark gene sets (H): coherently expressed signatures derived by aggregating many MSigDB gene sets to represent well-defined biological states or processes.
  • the C3 collection: regulatory target gene sets based on gene target predictions for microRNA seed sequences and predicted transcription factor binding sites.
  • the C8 collection: cell type signature gene sets curated from cluster markers identified in single-cell sequencing studies of human tissue.

We used the enricher function from clusterProfiler to perform the enrichment analysis.
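
enricher performs an over-representation analysis based on the hypergeometric test. To make the role of the universe explicit, here is a minimal base R sketch of that test for a single gene set. The universe, gene set and signature below are toy vectors and `ora_pvalue` is a hypothetical helper; the real function additionally filters gene sets by size and adjusts p-values for multiple testing.

```r
# Toy inputs (assumptions): a universe of 20 genes, one gene set of 5 genes
# and a cluster signature of 4 marker genes.
universe  <- paste0("g", 1:20)
gene_set  <- paste0("g", 1:5)
signature <- c("g1", "g2", "g3", "g10")

# Hypothetical helper: one-sided hypergeometric p-value for the overlap.
ora_pvalue <- function(signature, gene_set, universe) {
  # only genes that belong to the universe are counted
  signature <- intersect(signature, universe)
  gene_set  <- intersect(gene_set, universe)
  k <- length(intersect(signature, gene_set))  # overlap
  M <- length(gene_set)                        # gene set size
  N <- length(universe)                        # universe size
  n <- length(signature)                       # signature size
  # probability of an overlap of at least k genes by chance
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

p <- ora_pvalue(signature, gene_set, universe)
p  # 155/4845, about 0.032
```

Enlarging the universe with genes that could never be detected would make the same overlap look more significant, which is why the background is restricted to the genes of the Seurat object.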

# define background genes = universe for enrichment
background <-  row_metadata$gene_ids

# Define gene signature per cluster
signatures <- list()
for( i in unique(s.markers$group1)){
  signatures[[paste0("cluster ",i)]] <- s.markers$feature[s.markers$group1 == i]
}
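
As a side note, the same named list can be built without a loop using split(). This is only a sketch on a toy marker table with invented values. Unlike the loop, which follows the order of unique(), split() orders the elements by group level.

```r
# Toy marker table (values are invented); columns follow s.markers.
toy_markers <- data.frame(
  group1  = c("0", "0", "1", "2", "2"),
  feature = c("g1", "g2", "g3", "g4", "g5")
)

# One character vector of marker genes per cluster;
# drop = TRUE discards unused factor levels if group1 is a factor.
signatures_alt <- split(toy_markers$feature, toy_markers$group1, drop = TRUE)
names(signatures_alt) <- paste0("cluster ", names(signatures_alt))

str(signatures_alt)
```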

Hallmark MSigDB gene sets

Enrichment_plot(category = "H", signatures = signatures, background = background)

The EMT signature should be enriched in the stroma cluster. The E2F/proliferation, MYC(N) and TP53 signatures should be enriched in the blastema cluster.

C3 MSigDB regulatory target gene sets

Enrichment_plot(category = "C3", signatures = signatures, background = background)

Here we check whether we catch any miRNA-related pattern (DROSHA, DICER1, other?).

C8 MSigDB cell type signature gene sets

Enrichment_plot(category = "C8", signatures = signatures, background = background)

The MSigDB C8 collection is quite relevant for kidney and nephroblastoma annotation. Epithelial (cancer and normal) cells should be enriched in mature/adult kidney signatures, while blastema cancer cells should show enrichment of fetal kidney development / cap mesenchyme signatures.

Annotation from SingleR (no cancer or kidney specific dataset)

Here, we quickly check the annotations that are present in the _processed rds object. However, the automated annotation has not been performed using a cancer-specific or a kidney-specific reference. We do not expect a clean labelling of the cells, as the overlap of cell types between the reference and the query dataset is poor. This supports the need for a proper label transfer from the fetal kidney atlas, which is in my opinion the best reference that can be applied to a Wilms tumor query.

visualize_metadata(srat, meta = "singler_celltype_annotation", group.by = "seurat_clusters")

Annotation from CellAssign (no cancer or kidney specific dataset)

visualize_metadata(srat, meta = "cellassign_celltype_annotation", group.by = "seurat_clusters")

Even if the results of SingleR and CellAssign are not specific to a Wilms tumor dataset, they give a first impression that C12 is a cluster of endothelial cells and C11 a cluster of immune cells.
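
Such a first impression can also be read from a simple cross-tabulation of clusters against predicted labels. The following is a minimal base R sketch on a toy metadata table: the cluster numbers echo C11 and C12, but the labels and counts are invented and are not results from this dataset.

```r
# Toy metadata (all labels and counts are invented for illustration).
meta <- data.frame(
  seurat_clusters = c("11", "11", "11", "12", "12", "12", "12"),
  annotation      = c("T cell", "T cell", "B cell",
                      "endothelial", "endothelial", "endothelial", "fibroblast")
)

# Count cells per cluster and label, then convert to within-cluster proportions.
tab  <- table(meta$seurat_clusters, meta$annotation)
prop <- prop.table(tab, margin = 1)

# Most frequent label in each cluster.
majority <- colnames(prop)[apply(prop, 1, which.max)]
names(majority) <- rownames(prop)

majority
round(prop, 2)
```

A cluster dominated by a single label is easier to name than one where several labels share similar proportions.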

Azimuth annotation from fetal kidney

To be added in a future PR; see open PR #706.

For more information about the reference, please go to https://www.kidneycellatlas.org/. You will find:

  • interactive viewer

  • h5ad files to download.

Please note that, as Wilms tumor has been described to be closer to fetal kidney than to mature kidney, we only used the fetal kidney atlas as the reference. See also: https://www.science.org/doi/10.1126/science.aat5031

This part is in my opinion one of the most important steps, as it allows us to get a quick and reliable idea of the composition of the different clusters. The predicted compartments fall into 4 categories:

  • endothelium
  • stroma
  • fetal nephron
  • immune

As with SingleR and CellAssign, the annotation of immune cells and endothelial cells is straightforward. The stroma compartment should then contain normal and cancer stromal cells. The fetal nephron compartment contains blastema cancer cells as well as normal and cancer epithelial cells.

Further segregation of cancer versus normal cells will be achieved using a combination of markers/pathways (see above) and inferred CNV (to be done).

Session Info

# record the versions of the packages used in this analysis and other environment information
sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
## 
## Random number generation:
##  RNG:     L'Ecuyer-CMRG 
##  Normal:  Inversion 
##  Sample:  Rejection 
##  
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Europe/Vienna
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] clusterProfiler_4.12.2      enrichplot_1.24.2           data.table_1.15.4           msigdbr_7.5.1               org.Hs.eg.db_3.19.1         AnnotationDbi_1.66.0       
##  [7] BiocManager_1.30.23         viridis_0.6.5               viridisLite_0.4.2           assertthat_0.2.1            remotes_2.5.0               shiny_1.9.1                
## [13] SeuratData_0.2.2.9001       SingleCellExperiment_1.26.0 SummarizedExperiment_1.34.0 Biobase_2.64.0              GenomicRanges_1.56.1        GenomeInfoDb_1.40.1        
## [19] IRanges_2.38.1              S4Vectors_0.42.1            BiocGenerics_0.50.0         MatrixGenerics_1.16.0       matrixStats_1.3.0           patchwork_1.2.0            
## [25] lubridate_1.9.3             forcats_1.0.0               stringr_1.5.1               dplyr_1.1.4                 purrr_1.0.2                 readr_2.1.5                
## [31] tidyr_1.3.1                 tibble_3.2.1                ggplot2_3.5.1               tidyverse_2.0.0             SCpubr_2.0.2                Azimuth_0.5.0              
## [37] shinyBS_0.61.1              sctransform_0.4.1           Seurat_5.1.0                SeuratObject_5.0.2          sp_2.1-4                   
## 
## loaded via a namespace (and not attached):
##   [1] R.methodsS3_1.8.2                 vroom_1.6.5                       poweRlaw_0.80.0                   goftest_1.2-3                     DT_0.33                          
##   [6] Biostrings_2.72.1                 vctrs_0.6.5                       spatstat.random_3.3-1             digest_0.6.36                     png_0.1-8                        
##  [11] ggrepel_0.9.5                     deldir_2.0-4                      parallelly_1.38.0                 renv_1.0.7                        MASS_7.3-61                      
##  [16] Signac_1.13.0                     reshape2_1.4.4                    qvalue_2.36.0                     httpuv_1.6.15                     withr_3.0.1                      
##  [21] ggfun_0.1.5                       xfun_0.46                         survival_3.7-0                    EnsDb.Hsapiens.v86_2.99.0         memoise_2.0.1                    
##  [26] gson_0.1.0                        DElegate_1.2.1                    tidytree_0.4.6                    zoo_1.8-12                        gtools_3.9.5                     
##  [31] pbapply_1.7-2                     R.oo_1.26.0                       KEGGREST_1.44.1                   promises_1.3.0                    httr_1.4.7                       
##  [36] restfulr_0.0.15                   globals_0.16.3                    fitdistrplus_1.2-1                rhdf5filters_1.16.0               rhdf5_2.48.0                     
##  [41] rstudioapi_0.16.0                 DOSE_3.30.2                       UCSC.utils_1.0.0                  miniUI_0.1.1.1                    generics_0.1.3                   
##  [46] babelgene_22.9                    curl_5.2.1                        zlibbioc_1.50.0                   ggraph_2.2.1                      polyclip_1.10-7                  
##  [51] GenomeInfoDbData_1.2.12           SparseArray_1.4.8                 xtable_1.8-4                      pracma_2.4.4                      evaluate_0.24.0                  
##  [56] S4Arrays_1.4.1                    hms_1.1.3                         irlba_2.3.5.1                     colorspace_2.1-1                  hdf5r_1.3.11                     
##  [61] ROCR_1.0-11                       reticulate_1.38.0                 spatstat.data_3.1-2               magrittr_2.0.3                    lmtest_0.9-40                    
##  [66] ggtree_3.12.0                     later_1.3.2                       lattice_0.22-6                    glmGamPoi_1.16.0                  spatstat.geom_3.3-2              
##  [71] future.apply_1.11.2               shadowtext_0.1.4                  scattermore_1.2                   XML_3.99-0.17                     cowplot_1.1.3                    
##  [76] RcppAnnoy_0.0.22                  pillar_1.9.0                      nlme_3.1-165                      pwalign_1.0.0                     caTools_1.18.2                   
##  [81] compiler_4.4.1                    RSpectra_0.16-2                   stringi_1.8.4                     tensor_1.5                        GenomicAlignments_1.40.0         
##  [86] plyr_1.8.9                        crayon_1.5.3                      abind_1.4-5                       BiocIO_1.14.0                     gridGraphics_0.5-1               
##  [91] googledrive_2.1.1                 locfit_1.5-9.10                   graphlayouts_1.1.1                bit_4.0.5                         fastmatch_1.1-4                  
##  [96] codetools_0.2-20                  crosstalk_1.2.1                   bslib_0.8.0                       plotly_4.10.4                     mime_0.12                        
## [101] splines_4.4.1                     Rcpp_1.0.13                       fastDummies_1.7.3                 sparseMatrixStats_1.16.0          HDO.db_0.99.1                    
## [106] cellranger_1.1.0                  knitr_1.48                        blob_1.2.4                        utf8_1.2.4                        seqLogo_1.70.0                   
## [111] AnnotationFilter_1.28.0           fs_1.6.4                          listenv_0.9.1                     DelayedMatrixStats_1.26.0         ggplotify_0.1.2                  
## [116] Matrix_1.7-0                      statmod_1.5.0                     tzdb_0.4.0                        tweenr_2.0.3                      pkgconfig_2.0.3                  
## [121] tools_4.4.1                       cachem_1.1.0                      RSQLite_2.3.7                     DBI_1.2.3                         fastmap_1.2.0                    
## [126] rmarkdown_2.27                    scales_1.3.0                      grid_4.4.1                        ica_1.0-3                         shinydashboard_0.7.2             
## [131] Rsamtools_2.20.0                  sass_0.4.9                        dotCall64_1.1-1                   RANN_2.6.1                        farver_2.1.2                     
## [136] scatterpie_0.2.3                  tidygraph_1.3.1                   yaml_2.3.10                       rtracklayer_1.64.0                cli_3.6.3                        
## [141] leiden_0.4.3.1                    lifecycle_1.0.4                   uwot_0.2.2                        presto_1.0.0                      BSgenome.Hsapiens.UCSC.hg38_1.4.5
## [146] BiocParallel_1.38.0               annotate_1.82.0                   timechange_0.3.0                  gtable_0.3.5                      rjson_0.2.21                     
## [151] ggridges_0.5.6                    progressr_0.14.0                  ape_5.8                           parallel_4.4.1                    limma_3.60.4                     
## [156] jsonlite_1.8.8                    edgeR_4.2.1                       RcppHNSW_0.6.0                    TFBSTools_1.42.0                  bitops_1.0-8                     
## [161] bit64_4.0.5                       Rtsne_0.17                        yulab.utils_0.1.5                 spatstat.utils_3.0-5              CNEr_1.40.0                      
## [166] highr_0.11                        jquerylib_0.1.4                   GOSemSim_2.30.0                   shinyjs_2.1.0                     SeuratDisk_0.0.0.9021            
## [171] spatstat.univar_3.0-0             R.utils_2.12.3                    lazyeval_0.2.2                    htmltools_0.5.8.1                 GO.db_3.19.1                     
## [176] rappdirs_0.3.3                    ensembldb_2.28.0                  glue_1.7.0                        TFMPvalue_0.0.9                   spam_2.10-0                      
## [181] googlesheets4_1.1.1               XVector_0.44.0                    RCurl_1.98-1.16                   treeio_1.28.0                     rprojroot_2.0.4                  
## [186] BSgenome_1.72.0                   gridExtra_2.3                     JASPAR2020_0.99.10                igraph_2.0.3                      R6_2.5.1                         
## [191] RcppRoll_0.3.1                    labeling_0.4.3                    GenomicFeatures_1.56.0            cluster_2.1.6                     Rhdf5lib_1.26.0                  
## [196] gargle_1.5.2                      aplot_0.2.3                       DirichletMultinomial_1.46.0       DelayedArray_0.30.1               tidyselect_1.2.1                 
## [201] ProtGenerics_1.36.0               ggforce_0.4.2                     future_1.34.0                     munsell_0.5.1                     KernSmooth_2.23-24               
## [206] fgsea_1.30.0                      htmlwidgets_1.6.4                 RColorBrewer_1.1-3                rlang_1.1.4                       spatstat.sparse_3.1-0            
## [211] spatstat.explore_3.3-1            fansi_1.0.6